Goto

Collaborating Authors

 Bardhaman


Integrating Linguistics and AI: Morphological Analysis and Corpus development of Endangered Toto Language of West Bengal

Guha, Ambalika, Saha, Sajal, Ballav, Debanjan, Mitra, Soumi, Chakraborty, Hritwick

arXiv.org Artificial Intelligence

Preserving linguistic diversity is necessary as every language offers a distinct perspective on the world. There have been numerous global initiatives to preserve endangered languages through documentation. This paper is a part of a project which aims to develop a trilingual (Toto-Bangla-English) language learning application to digitally archive and promote the endangered Toto language of West Bengal, India. This application, designed for both native Toto speakers and non-native learners, aims to revitalize the language by ensuring accessibility and usability through Unicode script integration and a structured language corpus. The research includes detailed linguistic documentation collected via fieldwork, followed by the creation of a morpheme-tagged, trilingual corpus used to train a Small Language Model (SLM) and a Transformer-based translation engine. The analysis covers inflectional morphology such as person-number-gender agreement, tense-aspect-mood distinctions, and case marking, alongside derivational strategies that reflect word-class changes. Script standardization and digital literacy tools were also developed to enhance script usage. The study offers a sustainable model for preserving endangered languages by incorporating traditional linguistic methodology with AI. This bridge between linguistic research with technological innovation highlights the value of interdisciplinary collaboration for community-based language revitalization.


Data-Driven Antenna Miniaturization: A Knowledge-Based System Integrating Quantum PSO and Predictive Machine Learning Models

Parvez, Khan Masood, Rahaman, Sk Md Abidar, Sichani, Ali Shiri

arXiv.org Artificial Intelligence

The rapid evolution of wireless technologies necessitates automated design frameworks to address antenna miniaturization and performance optimization within constrained development cycles. This study demonstrates a machine learning enhanced workflow integrating Quantum-Behaved Dynamic Particle Swarm Optimization (QDPSO) with ANSYS HFSS simulations to accelerate antenna design. The QDPSO algorithm autonomously optimized loop dimensions in 11.53 seconds, achieving a resonance frequency of 1.4208 GHz a 12.7 percent reduction compared to conventional 1.60 GHz designs. Machine learning models (SVM, Random Forest, XGBoost, and Stacked ensembles) predicted resonance frequencies in 0.75 seconds using 936 simulation datasets, with stacked models showing superior training accuracy (R2=0.9825) and SVM demonstrating optimal validation performance (R2=0.7197). The complete design cycle, encompassing optimization, prediction, and ANSYS validation, required 12.42 minutes on standard desktop hardware (Intel i5-8500, 16GB RAM), contrasting sharply with the 50-hour benchmark of PSADEA-based approaches. This 240 times of acceleration eliminates traditional trial-and-error methods that often extend beyond seven expert-led days. The system enables precise specifications of performance targets with automated generation of fabrication-ready parameters, particularly benefiting compact consumer devices requiring rapid frequency tuning. By bridging AI-driven optimization with CAD validation, this framework reduces engineering workloads while ensuring production-ready designs, establishing a scalable paradigm for next-generation RF systems in 6G and IoT applications.


Real-time classification of EEG signals using Machine Learning deployment

Chowdhuri, Swati, Saha, Satadip, Karmakar, Samadrita, Chanda, Ankur

arXiv.org Artificial Intelligence

The prevailing educational methods predominantly rely on traditional classroom instruction or online delivery, often limiting the teachers' ability to engage effectively with all the students simultaneously. A more intrinsic method of evaluating student attentiveness during lectures can enable the educators to tailor the course materials and their teaching styles in order to better meet the students' needs. The aim of this paper is to enhance teaching quality in real time, thereby fostering a higher student engagement in the classroom activities. By monitoring the students' electroencephalography (EEG) signals and employing machine learning algorithms, this study proposes a comprehensive solution for addressing this challenge. Machine learning has emerged as a powerful tool for simplifying the analysis of complex variables, enabling the effective assessment of the students' concentration levels based on specific parameters. However, the real-time impact of machine learning models necessitates a careful consideration as their deployment is concerned. This study proposes a machine learning-based approach for predicting the level of students' comprehension with regard to a certain topic. A browser interface was introduced that accesses the values of the system's parameters to determine a student's level of concentration on a chosen topic. The deployment of the proposed system made it necessary to address the real-time challenges faced by the students, consider the system's cost, and establish trust in its efficacy. This paper presents the efforts made for approaching this pertinent issue through the implementation of innovative technologies and provides a framework for addressing key considerations for future research directions.


A data balancing approach towards design of an expert system for Heart Disease Prediction

Karmakar, Rahul, Ghosh, Udita, Pal, Arpita, Dey, Sattwiki, Malik, Debraj, Sain, Priyabrata

arXiv.org Artificial Intelligence

Heart disease is a serious global health issue that claims millions of lives every year. Early detection and precise prediction are critical to the prevention and successful treatment of heart related issues. A lot of research utilizes machine learning (ML) models to forecast cardiac disease and obtain early detection. In order to do predictive analysis on "Heart disease health indicators " dataset. We employed five machine learning methods in this paper: Decision Tree (DT), Random Forest (RF), Linear Discriminant Analysis, Extra Tree Classifier, and AdaBoost. The model is further examined using various feature selection (FS) techniques. To enhance the baseline model, we have separately applied four FS techniques: Sequential Forward FS, Sequential Backward FS, Correlation Matrix, and Chi2. Lastly, K means SMOTE oversampling is applied to the models to enable additional analysis. The findings show that when it came to predicting heart disease, ensemble approaches in particular, random forests performed better than individual classifiers. The presence of smoking, blood pressure, cholesterol, and physical inactivity were among the major predictors that were found. The accuracy of the Random Forest and Decision Tree model was 99.83%. This paper demonstrates how machine learning models can improve the accuracy of heart disease prediction, especially when using ensemble methodologies. The models provide a more accurate risk assessment than traditional methods since they incorporate a large number of factors and complex algorithms.


CavDetect: A DBSCAN Algorithm based Novel Cavity Detection Model on Protein Structure

Adhikari, Swati, Roy, Parthajit

arXiv.org Artificial Intelligence

Cavities on the structures of proteins are formed due to interaction between proteins and some small molecules, known as ligands. These are basically the locations where ligands bind with proteins. Actual detection of such locations is all-important to succeed in the entire drug design process. This study proposes a Voronoi Tessellation based novel cavity detection model that is used to detect cavities on the structure of proteins. As the atom space of protein structure is dense and of large volumes and the DBSCAN (Density Based Spatial Clustering of Applications with Noise) algorithm can handle such type of data very well as well as it is not mandatory to have knowledge about the numbers of clusters (cavities) in data as priori in this algorithm, this study proposes to implement the proposed algorithm with the DBSCAN algorithm.


Deep Learning-based Sentiment Analysis of Olympics Tweets

Bandyopadhyay, Indranil, Karmakar, Rahul

arXiv.org Artificial Intelligence

Sentiment analysis (SA), is an approach of natural language processing (NLP) for determining a text's emotional tone by analyzing subjective information such as views, feelings, and attitudes toward specific topics, products, services, events, or experiences. This study attempts to develop an advanced deep learning (DL) model for SA to understand global audience emotions through tweets in the context of the Olympic Games. The findings represent global attitudes around the Olympics and contribute to advancing the SA models. We have used NLP for tweet pre-processing and sophisticated DL models for arguing with SA, this research enhances the reliability and accuracy of sentiment classification. The study focuses on data selection, preprocessing, visualization, feature extraction, and model building, featuring a baseline Na\"ive Bayes (NB) model and three advanced DL models: Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (BiLSTM), and Bidirectional Encoder Representations from Transformers (BERT). The results of the experiments show that the BERT model can efficiently classify sentiments related to the Olympics, achieving the highest accuracy of 99.23%.


Color Channel Perturbation Attacks for Fooling Convolutional Neural Networks and A Defense Against Such Attacks

Kantipudi, Jayendra, Dubey, Shiv Ram, Chakraborty, Soumendu

arXiv.org Artificial Intelligence

The Convolutional Neural Networks (CNNs) have emerged as a very powerful data dependent hierarchical feature extraction method. It is widely used in several computer vision problems. The CNNs learn the important visual features from training samples automatically. It is observed that the network overfits the training samples very easily. Several regularization methods have been proposed to avoid the overfitting. In spite of this, the network is sensitive to the color distribution within the images which is ignored by the existing approaches. In this paper, we discover the color robustness problem of CNN by proposing a Color Channel Perturbation (CCP) attack to fool the CNNs. In CCP attack new images are generated with new channels created by combining the original channels with the stochastic weights. Experiments were carried out over widely used CIFAR10, Caltech256 and TinyImageNet datasets in the image classification framework. The VGG, ResNet and DenseNet models are used to test the impact of the proposed attack. It is observed that the performance of the CNNs degrades drastically under the proposed CCP attack. Result show the effect of the proposed simple CCP attack over the robustness of the CNN trained model. The results are also compared with existing CNN fooling approaches to evaluate the accuracy drop. We also propose a primary defense mechanism to this problem by augmenting the training dataset with the proposed CCP attack. The state-of-the-art performance using the proposed solution in terms of the CNN robustness under CCP attack is observed in the experiments. The code is made publicly available at \url{https://github.com/jayendrakantipudi/Color-Channel-Perturbation-Attack}.


An Improved Simulation Model for Pedestrian Crowd Evacuation

Muhammed, Danial A., Rashid, Tarik A., Alsadoon, Abeer, Bacanin, Nebojsa, Fattah, Polla, Mohammadi, Mokhtar, Banerjee, Indradip

arXiv.org Artificial Intelligence

This paper works on one of the most recent pedestrian crowd evacuation models, i.e., "a simulation model for pedestrian crowd evacuation based on various AI techniques", developed in late 2019. This study adds a new feature to the developed model by proposing a new method and integrating it with the model. This method enables the developed model to find a more appropriate evacuation area design, among others regarding safety due to selecting the best exit door location among many suggested locations. This method is completely dependent on the selected model's output, i.e., the evacuation time for each individual within the evacuation process. The new method finds an average of the evacuees' evacuation times of each exit door location; then, based on the average evacuation time, it decides which exit door location would be the best exit door to be used for evacuation by the evacuees. To validate the method, various designs for the evacuation area with various written scenarios were used. The results showed that the model with this new method could predict a proper exit door location among many suggested locations. Lastly, from the results of this research using the integration of this newly proposed method, a new capability for the selected model in terms of safety allowed the right decision in selecting the finest design for the evacuation area among other designs.